Genomic correlates of programmed cell death ligand 1 (PD-L1) expression in Chinese lung adenocarcinoma patients

Background Although PD-L1 expression is a crucial predictive biomarker for immunotherapy, it can be influenced by many factors. Methods A total of 248 Chinese patients with lung adenocarcinoma were retrospectively identified. Data on clinical features, gene alterations, signaling pathways and immune signatures were compared among the negative expression group (TPS < 1%, n = 124), the intermediate expression group (1% ≤ TPS < 50%, n = 93), and the high expression group (TPS ≥ 50%, n = 31). Clinical outcomes among the expression groups were also evaluated using public databases. Results High tumor mutation burden was significantly associated with high PD-L1 expression in these Chinese patients with lung adenocarcinoma. Alterations in TP53, PRKDC, KMT2D, TET1 and SETD2 were enriched in the high PD-L1 expression group. Pathway analysis showed that the frequency of mutations in the DDR, TP53, cell-cycle and NOTCH pathways differed among the three PD-L1 expression groups. In addition, most patients in the high PD-L1 expression group from the TCGA database were classified as high-grade immune subtypes (C2-C4) and showed significantly higher proportions of IFN-gamma, CD8+ T cells, NK cells, NK CD56 dim cells, Th1 cells and Th2 cells (P < 0.0001). SETD2 mutation showed a weak, non-significant association with overall survival in the MSKCC cohort (HR 1.92 [95% CI 0.90–4.10], P = 0.085), and the percentage of IFN-gamma was significantly higher in the SETD2-mutant group than in the wild-type group (P < 0.01). Conclusions This study characterized genomic correlates of PD-L1 expression in Chinese lung adenocarcinoma patients, together with relevant immune signatures from public databases, which may help explain potential molecular mechanisms of immunotherapy in NSCLC.

T lymphocyte antigen-4 (CTLA-4) antibodies have recently revolutionized the treatment of NSCLC and have emerged as promising therapeutic strategies for NSCLC patients [6,7].
Because a considerable number of patients still do not benefit from ICBs, predictive biomarkers of clinical response to immunotherapy can help clinicians select likely responders early and start appropriate therapeutic regimens in a timely manner [6,7]. For example, some studies have demonstrated that positive programmed death-ligand 1 (PD-L1) expression correlates significantly with improved response in NSCLC [8,9]. Based on the results of the KEYNOTE-024 clinical trial, pembrolizumab was approved by the FDA as front-line therapy for patients with advanced lung cancer who have high PD-L1 expression (TPS ≥ 50%) and are EGFR and ALK wild-type [10]. In the KEYNOTE-042 clinical trial, front-line pembrolizumab for metastatic NSCLC patients with positive PD-L1 expression (TPS ≥ 1%) produced better clinical outcomes than platinum-based chemotherapy [11].
With advances in next-generation sequencing (NGS) techniques, we retrospectively conducted an in-depth analysis to characterize the factors associated with PD-L1 expression in Chinese lung adenocarcinoma patients. This study may help illustrate potential molecular mechanisms of immunotherapy in NSCLC.

Study design
Patients with lung adenocarcinoma who received anti-cancer treatment in our hospital from January 2019 to May 2020 were retrospectively identified, and relevant clinical data were collected. Formalin-fixed paraffin-embedded (FFPE) tumor tissue or fresh tissue from each patient was obtained by biopsy or surgery for the PD-L1 expression assay and for genomic profiling using an NGS panel (YuceOne™ Plus, Yucebio, China).

PD-L1 immunohistochemistry
The Dako PD-L1 IHC 22C3 pharmDx assay was used to detect PD-L1 protein expression in FFPE slides according to the manufacturer's recommendations. PD-L1 expression was calculated as the tumor proportion score (TPS), the percentage of tumor cells with complete or partial membrane staining (central or marginal tumor region). Patients were then divided into a "negative" expression group (TPS < 1%), an "intermediate" expression group (1% ≤ TPS < 50%), and a "high" expression group (TPS ≥ 50%).
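The three-way grouping above can be written as a small function. This is only an illustrative sketch: the function name `pd_l1_group` and the input convention (TPS given as a percentage from 0 to 100) are assumptions, not part of the study's pipeline.

```python
def pd_l1_group(tps_percent: float) -> str:
    """Map a tumor proportion score (in percent) to an expression group.

    Thresholds follow the text: negative (TPS < 1%),
    intermediate (1% <= TPS < 50%), high (TPS >= 50%).
    The function name and percent-based input are illustrative assumptions.
    """
    if not 0.0 <= tps_percent <= 100.0:
        raise ValueError("TPS must be a percentage between 0 and 100")
    if tps_percent < 1.0:
        return "negative"
    if tps_percent < 50.0:
        return "intermediate"
    return "high"


# Boundary values show which side of each cutoff a score falls on.
groups = [pd_l1_group(t) for t in (0.0, 0.5, 1.0, 49.0, 50.0, 90.0)]
print(groups)
```

Note that a TPS of exactly 1% falls in the intermediate group and exactly 50% in the high group, matching the inequalities given in the text.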

Next generation sequencing and mutation analysis
Genomic profiling was performed on tumor tissue and matched peripheral blood samples. Genomic DNA was extracted from tumor specimens and blood using the GeneRead DNA FFPE Kit (Qiagen) and the Qiagen DNA blood mini kit (Qiagen). The extracted DNA was then amplified, purified, and analyzed using the NGS panel (YuceOne™ Plus, Yucebio, China).
Sequencing reads with > 10% N rate and/or > 10% bases with quality score < 20 were filtered out using SOAPnuke (Version 1.5.6). Somatic single nucleotide variants (SNVs) and insertions and deletions (InDels) were detected using VarScan (Version 2.4), and an in-house method was then applied to filter possible false-positive mutations. SnpEff (Version 4.3) was used for functional annotation of the mutations detected in the tumor sample. Tumor mutation burden (TMB) was calculated from non-silent somatic mutations, including coding base substitutions and indels.
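The read-level filtering rule can be illustrated with a short sketch. It mirrors only the two thresholds stated in the text and is not SOAPnuke's implementation; the function name, the parameter names and the use of already-decoded Phred scores are assumptions made for illustration.

```python
from typing import Sequence


def passes_read_filter(seq: str,
                       quals: Sequence[int],
                       max_n_rate: float = 0.10,
                       max_lowq_rate: float = 0.10,
                       min_qual: int = 20) -> bool:
    """Return True if a read survives the two filters described in the text.

    A read is discarded when more than 10% of its bases are N, or when
    more than 10% of its bases have a quality score below 20.
    `quals` holds decoded Phred scores, one per base (an assumption).
    """
    if not seq or len(seq) != len(quals):
        raise ValueError("sequence and qualities must be non-empty and equal in length")
    n_rate = seq.upper().count("N") / len(seq)
    lowq_rate = sum(1 for q in quals if q < min_qual) / len(quals)
    return n_rate <= max_n_rate and lowq_rate <= max_lowq_rate


clean = passes_read_filter("ACGTACGTAC", [30] * 10)             # no N, all high quality
too_many_n = passes_read_filter("ACGTNNGTAC", [30] * 10)        # 20% N bases
low_quality = passes_read_filter("ACGTACGTAC", [30] * 8 + [10, 10])  # 20% bases below Q20
print(clean, too_many_n, low_quality)
```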
HLA typing of tumor and matched control samples was performed using OptiType (Version 1.3.2). Loss of heterozygosity (LOH) at the HLA loci was detected using LOHHLA [22]. Neoantigen prediction was performed as previously described [23]. Tumor neoantigen burden (TNB) was measured as the number of mutations that could generate neoantigens per megabase.
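Both TMB and TNB are counts normalized per megabase of sequenced territory. The sketch below shows that arithmetic under stated assumptions: the text gives neither the covered region size of the panel nor the exact list of annotation terms counted as non-silent, so `covered_bases` and `NON_SILENT_EFFECTS` are hypothetical values chosen for illustration (the effect names are standard Sequence Ontology terms).

```python
from typing import Iterable

# Hypothetical choice of annotation terms treated as non-silent;
# the study does not list the exact terms it used.
NON_SILENT_EFFECTS = {
    "missense_variant", "stop_gained", "stop_lost", "start_lost",
    "frameshift_variant", "inframe_insertion", "inframe_deletion",
    "splice_donor_variant", "splice_acceptor_variant",
}


def burden_per_mb(n_events: int, covered_bases: int) -> float:
    """Normalize an event count to events per megabase of covered sequence."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return n_events / (covered_bases / 1_000_000)


def tumor_mutation_burden(effects: Iterable[str], covered_bases: int) -> float:
    """Count non-silent somatic mutations and express them per megabase."""
    n_non_silent = sum(1 for effect in effects if effect in NON_SILENT_EFFECTS)
    return burden_per_mb(n_non_silent, covered_bases)


# Toy input: four somatic calls, one of which is synonymous (silent).
effects = ["missense_variant", "synonymous_variant",
           "frameshift_variant", "stop_gained"]
covered_bases = 1_500_000  # hypothetical panel territory, not from the paper

tmb = tumor_mutation_burden(effects, covered_bases)   # 3 mutations / 1.5 Mb
tnb = burden_per_mb(6, covered_bases)                 # 6 neoantigen-generating mutations / 1.5 Mb
print(tmb, tnb)
```

The same normalization function serves both metrics; only the numerator differs (non-silent mutations for TMB, neoantigen-generating mutations for TNB).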

Copy number variations analysis
Somatic copy number alteration (SCNA) analysis was performed using Allele-Specific Copy number Analysis of Tumors (ASCAT) with default parameters and the FACETS algorithm. GISTIC2.0 was then used to identify significant driver somatic CNVs by evaluating the frequencies and amplitudes of observed events. Chromosomal instability (CIN) was estimated using the weighted chromosomal instability (wCIN) score, which is defined as the percentage of copy-number-altered genome per chromosome, averaged over the 22 autosomal chromosomes [24].
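A per-chromosome average of this kind can be sketched as follows. The text does not spell out how the per-chromosome percentage is computed, so this example assumes that a segment counts as altered when its total copy number differs from the sample ploidy; the segment layout and the function name `wcin_score` are likewise illustrative assumptions rather than the method of [24].

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (chromosome, start, end, total copy number); layout is an assumption.
Segment = Tuple[str, int, int, int]

AUTOSOMES = {str(i) for i in range(1, 23)}


def wcin_score(segments: Iterable[Segment], ploidy: int = 2) -> float:
    """Average, over autosomes, of the percentage of each chromosome altered.

    Assumption: a segment is 'altered' if its copy number differs from
    `ploidy`. Only autosomes with at least one segment enter the average;
    with genome-wide data that is all 22 of them.
    """
    altered: Dict[str, int] = defaultdict(int)
    covered: Dict[str, int] = defaultdict(int)
    for chrom, start, end, copy_number in segments:
        if chrom not in AUTOSOMES:
            continue  # sex chromosomes are excluded from the score
        length = end - start
        if length <= 0:
            raise ValueError("segment end must be greater than start")
        covered[chrom] += length
        if copy_number != ploidy:
            altered[chrom] += length
    if not covered:
        raise ValueError("no autosomal segments supplied")
    percentages = [100.0 * altered[c] / covered[c] for c in covered]
    return sum(percentages) / len(percentages)


toy_segments = [
    ("1", 0, 100, 2), ("1", 100, 200, 3),  # half of chromosome 1 altered
    ("2", 0, 300, 2),                      # chromosome 2 unaltered
    ("X", 0, 500, 1),                      # ignored: not an autosome
]
score = wcin_score(toy_segments)
print(score)
```

Averaging per-chromosome percentages, rather than pooling all altered bases, gives each chromosome equal weight regardless of its length.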

Pathways and immune signatures analysis
Pathway analysis was restricted to genes that overlapped between previously reported gene lists [25,26] and the YuceOne™ Plus panel. Additionally, the IFN-gamma signature and the proportions of infiltrating immune cells were analyzed according to previous studies [27,28]. Immune signature scores were calculated using the ssGSEA method implemented in the R package GSVA [29].

Statistical analysis
Correlations between PD-L1 expression and clinical parameters were analyzed using Fisher's exact test for categorical variables. Kruskal-Wallis rank sum tests were used to compare continuous variables across multiple groups, and Wilcoxon rank sum tests were used to compare continuous variables between two groups. Q values were calculated by FDR correction for multiple comparisons. Survival was analyzed using Kaplan-Meier plots, and P values were calculated with the log-rank test. P < 0.05 or Q < 0.25 was considered statistically significant. All statistical analyses were performed in the R statistical computing environment v3.6.1 (http://www.r-project.org).
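To make the Q-value step concrete, the sketch below implements the Benjamini-Hochberg procedure. The text says only "FDR correction", so treating it as Benjamini-Hochberg is an assumption (it is what the "fdr" method of R's `p.adjust` computes); the function name and the toy P values are illustrative and do not come from the study.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values (Q values) in input order."""
    m = len(p_values)
    if any(not 0.0 <= p <= 1.0 for p in p_values):
        raise ValueError("P values must lie between 0 and 1")
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone Q values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


toy_p = [0.01, 0.04, 0.03, 0.20]        # illustrative values only
toy_q = benjamini_hochberg(toy_p)
significant = [q < 0.25 for q in toy_q]  # Q < 0.25 threshold from the text
print(toy_q, significant)
```

With the Q < 0.25 threshold, a test can be called significant even when its raw P value would not survive a stricter family-wise correction, which is why the cutoff should be read as exploratory.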

General clinical and mutational characteristics in Chinese lung adenocarcinoma patients
As shown in Table 1, a total of 248 Chinese lung adenocarcinoma patients were identified and included in this study. According to the results of the PD-L1 expression assay, these patients were divided into three groups: a negative PD-L1 expression group with TPS < 1% (n = 124, 50%), an intermediate PD-L1 expression group with TPS 1%-49% (n = 93, 38%), and a high PD-L1 expression group with TPS ≥ 50% (n = 31, 12%) (Table 1).
The median age and gender proportions were very similar among the three PD-L1 expression groups, suggesting that PD-L1 expression level was not associated with age or gender. The median TMB in the PD-L1 high expression group was significantly higher than in the intermediate or negative expression groups [median (interquartile range) 6

Significant genomic mutations associated with PD-L1 expression in Chinese lung adenocarcinoma patients
The 25 most frequent genomic alterations, such as high-level amplifications or mutations in oncogenes and deep deletions in tumor suppressors, are listed in Fig. 1

Key signaling pathways related with PD-L1 expression in Chinese lung adenocarcinoma patients
Next, we analyzed oncogenic signaling pathways (Fig. 2). Alterations involved with the DNA damage response (DDR)  (Fig. 3)

Major immune signatures linked to PD-L1 expression in lung adenocarcinoma patients from TCGA database
Based on the TCGA-LUAD database, we characterized immune signatures among the PD-L1 expression groups (Fig. 4 a-f) and found that most patients in the high PD-L1 group were classified as high-grade immune subtypes (C2-C4). Compared with the PD-L1 negative expression group, the PD-L1 high expression group showed higher proportions of IFN-gamma, CD8+ T cells, NK cells, NK CD56 dim cells, Th1 cells and Th2 cells (P < 0.0001) and lower percentages of NK CD56 bright cells and Th17 cells (P < 0.05), supporting that high PD-L1 expression level can be a prognostic marker for anti-cancer immunotherapy.

Potential therapeutic response correlated to SETD2 mutation from public cohort
As shown in Fig. 4g

Discussion
Based on published clinical trials, PD-L1 expression can help clinicians choose single-agent immunotherapy for NSCLC patients with high PD-L1 expression or combined chemo-immunotherapy for those with low PD-L1 expression. However, because conflicting results continue to emerge, the prognostic value of PD-L1 expression for ICBs has recently been challenged [8,9]. Apart from variability among immunohistochemical staining antibodies and heterogeneous expression across tumor sites, PD-L1 expression has been found to be influenced by extrinsic and intrinsic factors in NSCLC. In this study, we conducted an in-depth analysis to reveal latent genomic or clinical correlates of PD-L1 expression in Chinese lung adenocarcinoma patients. In this retrospective study, clinical features such as age and gender were not associated with PD-L1 expression in lung adenocarcinoma. High TMB was significantly associated with high PD-L1 expression in lung adenocarcinoma (P < 0.05), consistent with findings from multicenter studies [14,15].
It is generally accepted that patients from different ethnic groups have distinct clinical features and oncogenic mutations in different cancers. Although some studies have highlighted molecular associations between PD-L1 expression and genomic alterations in TP53, KRAS and EGFR [13][14][15][16], similar studies focusing on Asian populations are still very limited. Based on 15-gene NGS panel testing, Liu et al. found that, in Chinese lung cancer patients, EGFR mutations were more common in the PD-L1 negative expression group (TPS < 1%), ALK mutations were more common in the PD-L1 intermediate group (TPS 1%-49%), and BRAF and MET mutations were more common in the PD-L1 high group (TPS ≥ 50%) [30]. In addition to these common gene mutations, we found that genetic alterations in TP53, PRKDC, KMT2D, TET1 and SETD2 were enriched in Chinese lung adenocarcinoma patients with high PD-L1 expression. Similarly, it was recently reported that 75% of lung cancer patients with PRKDC mutations responded to immunotherapy, suggesting that PRKDC could be explored as both a predictive biomarker and a therapeutic target for ICBs [31]. These results may enrich the mutational spectrum associated with PD-L1 expression in Chinese lung adenocarcinoma patients and provide potential therapeutic targets for immunotherapy in NSCLC.
Fig. 3 Percentages of mutated genes in DDR pathways among different PD-L1 expression groups. DDR, DNA damage response; CPF, check point factors; MMR, mismatch repair; NHEJ, nonhomologous end-joining; FA, Fanconi anemia; HRR, homologous recombination repair; NER, nucleotide excision repair; BER, base excision repair. *P < 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001
In addition, activation of signaling pathways such as the PI3K-AKT-mTOR, JAK-STAT and KRAS-ERK pathways can regulate PD-L1 expression in many cancer types [17][18][19]. Recently, NSCLC patients with driver gene mutations in DDR pathways were reported to have significantly higher TMB values, a higher objective response rate and longer median PFS after anti-cancer immunotherapy [32]. We also found that gene alterations in the DDR, TP53, cell-cycle and NOTCH pathways occurred predominantly in patients with high PD-L1 expression (P < 0.05), which may provide further evidence on the molecular mechanisms involved in PD-L1 expression in NSCLC.
Because tumor immunity mechanisms are complex, analyzing TILs in the tumor microenvironment may be important for assessing tumor immunogenicity and predicting immunotherapy efficacy. Immune type I refers to patients with high PD-L1 expression and CD8+ TILs in the tumor microenvironment, most of whom can benefit from ICBs [33,34]. These patients are also more likely to have increased numbers of somatic driver mutations or tumor neoantigens, Epstein-Barr virus infection, and other such features [33,34]. Similarly, we characterized immune signatures associated with PD-L1 expression in patients with lung adenocarcinoma from the TCGA-LUAD database and found that the percentage of high-grade immune subtypes (C2-C4) was higher in the PD-L1 high group than in the PD-L1 low group. Significantly higher proportions of IFN-gamma, CD8+ T cells, NK cells, NK CD56 dim cells, Th1 cells and Th2 cells were found in the PD-L1 high group (P < 0.0001), whereas substantially lower percentages of NK CD56 bright cells and Th17 cells were observed (P < 0.05). We then found that SETD2 mutation was weakly, though not significantly, associated with overall survival in the MSKCC cohort (HR 1.92 [95% CI 0.90-4.10], P = 0.085), and that the percentages of IFN-gamma (P < 0.01) and CD8+ T cells (P < 0.05) were higher in the SETD2-mutant group than in the wild-type group.
(Figure legend, panels i and j) The normalized ssGSEA scores of i) IFN-gamma and j) CD8+ T cells in the groups with or without SETD2 mutations. LUAD, lung adenocarcinoma; IFN, interferon; NK cells, natural killer cells; Th cells, T helper cells; PFS, progression-free survival; OS, overall survival. 'ns', not significant. *P < 0.05, **P < 0.01, ****P < 0.0001
This study has several limitations. First, most of the patients in our study were treatment-naïve and may therefore have had lower PD-L1 expression levels than patients in later lines of treatment. Second, some clinical data, such as cancer stage and tumor site, were missing, which limited detailed analysis of clinical effects on PD-L1 expression. Third, because clinical survival data such as PFS and OS were lacking, we used TCGA data to evaluate the influence of PD-L1 expression on clinical response. As a result, patients were stratified by TPS in our cohort but by quartiles in the TCGA database, which is inconsistent. In addition, the number of patients with alterations in common driver genes such as ALK and EGFR may have been too small to investigate the roles of these mutations in PD-L1 expression. These factors may have introduced statistical bias. Further studies with larger sample sizes are planned.

Conclusions
In summary, our study provides a clearer genomic landscape of PD-L1 expression in Chinese lung adenocarcinoma patients, together with relevant immune signatures from public databases, which may help explain the potential molecular mechanisms of clinical immunotherapy in NSCLC.